Goto

Collaborating Authors

 fast distribution learning algorithm



Review for NeurIPS paper: SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm

Neural Information Processing Systems

The paper develops a simple approximation method for one-dimensional densities using piecewise polynomials, which adapts to the number of pieces. While all the reviewers are borderline positive, they bring up a few issues regarding both the theory and the experiments (and their general opinion has not changed after the rebuttal).


SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm

Neural Information Processing Systems

Sample- and computationally-efficient distribution estimation is a fundamental tenet in statistics and machine learning. In experiments, \SURF outperforms state-of-the art algorithms.


TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm

Hao, Yi, Jain, Ayush, Orlitsky, Alon, Ravindrakumar, Vaishakh

arXiv.org Machine Learning

Approximating distributions from their samples is a canonical statistical-learning problem. One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial, where $t\ge1$ and $d\ge0$. Letting $c_{t,d}$ denote the smallest such factor, clearly $c_{1,0}=1$, and it can be shown that $c_{t,d}\ge 2$ for all other $t$ and $d$. Yet current computationally efficient algorithms show only $c_{t,1}\le 2.25$ and the bound rises quickly to $c_{t,d}\le 3$ for $d\ge 9$. We derive a near-linear-time and essentially sample-optimal estimator that establishes $c_{t,d}=2$ for all $(t,d)\ne(1,0)$. Additionally, for many practical distributions, the lowest approximation distance is achieved by polynomials with vastly varying number of pieces. We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation. Experiments combining the two techniques confirm improved performance over existing methodologies.

  artificial intelligence, fast distribution learning algorithm, machine learning, (4 more...)
2202.07172
  Genre: Research Report (0.40)